Toward the Scalable Integration of Internet
Abstract
This dissertation examines the fundamental aspects of building a large-scale information integration system that can answer complex queries over a large number of heterogeneous Internet data sources. Among the many challenges in achieving this goal, we focus on two key issues: efficient query processing and schema matching.

Most of the data the integration system processes arrives as a stream from a remote source rather than residing on a local disk, so we need efficient query processing algorithms that work in this environment. We specifically investigate algorithms for evaluating sliding window joins over pairs of unbounded streams. We introduce a unit-time-basis cost model to analyze the expected performance of these algorithms and, using this cost model, propose strategies for maximizing the efficiency of processing joins over unbounded streams.

With respect to schema matching, we introduce an automated technique that works in the particularly difficult cases in which attribute names and data values are “opaque.” Schema matching has always been a problematic and interesting aspect of information integration, and the problem is exacerbated as the number of information sources to be integrated grows. Our approach does not depend on any domain-specific knowledge and is therefore applicable to many different domains, including domains to which the system has not previously been exposed. It works in two steps: 1) we measure the pairwise attribute correlations in the tables to be matched and construct a dependency graph for each table, using mutual information as the measure of dependency between attributes; and 2) we find matching node pairs in the dependency graphs by running a graph matching algorithm. We validate our approach with an experimental study.
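The two-step matching idea can be illustrated with a minimal sketch. The abstract does not give the distance metric or the graph matching algorithm, so the squared-difference cost and the exhaustive permutation search below are assumptions chosen for brevity, and all function names and the toy tables are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def entropy(column):
    """Shannon entropy (in bits) of a column of discrete values."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length columns."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def dependency_graph(table):
    """Step 1: weighted adjacency matrix of a table given as a list of rows.
    Entry [i][j] is the mutual information between columns i and j;
    the diagonal therefore holds each column's entropy."""
    cols = list(zip(*table))
    k = len(cols)
    return [[mutual_information(cols[i], cols[j]) for j in range(k)]
            for i in range(k)]


def match_graphs(g1, g2):
    """Step 2 (assumed metric): try every column mapping and keep the one
    with the smallest squared difference between the two graphs.
    Exhaustive search is only feasible for a handful of columns."""
    k = len(g1)

    def cost(perm):
        return sum((g1[i][j] - g2[perm[i]][perm[j]]) ** 2
                   for i in range(k) for j in range(k))

    return min(permutations(range(k)), key=cost)


# Toy data: table_b holds the same rows as table_a, but with the columns
# reordered and every value re-encoded, so names and values are "opaque".
table_a = [(a, a % 2, int(i == 7)) for i, a in enumerate([0, 1, 2, 3] * 2)]
table_b = [("mn"[c], "pqrs"[a], "xy"[b]) for a, b, c in table_a]

mapping = match_graphs(dependency_graph(table_a), dependency_graph(table_b))
print(mapping)  # mapping[i] is the column of table_b matched to column i of table_a
```

Because mutual information depends only on how values co-occur and not on what the values are, the re-encoded table yields the same dependency graph up to a reordering of nodes, which is what makes matching possible without domain knowledge.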
Similar resources
An Efficient Secret Sharing-based Storage System for Cloud-based Internet of Things
The Internet of Things (IoT) is a new Internet-based information architecture that develops interactions between objects and services in a secure and reliable environment. As the availability of smart devices rises, secure and scalable mass storage systems for aggregate data are required in IoT applications. In this paper, we propose a new method for storing aggregate data in Io...
Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms
Image watermarking refers to the process of embedding an authentication message, called watermark, into the host image to uniquely identify the ownership. In this paper a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions to insert watermark. The embedding positions coded as chr...
The Necessity of Integration of Internet of Things and Fog Computing for Providing Safe Healthcare Services
This article has no abstract.
The Mediating Role of Emotion Regulation Difficulties in the Relationship between Family Communication Patterns with Tendency toward High-Risk Behaviors and Internet Addiction
This study aimed to investigate the mediating role of emotion regulation difficulties in the relationship between family communication patterns with the tendency toward high-risk behaviors and internet addiction. The statistical population included all students of the Azad University South Tehran Branch in the 1397-98 academic year. Using the convenience sampling method, 220 students entered th...
Semantic Constraint and QoS-Aware Large-Scale Web Service Composition
Service-oriented architecture facilitates the running time of interactions by using business integration on the networks. Currently, web services are considered as the best option to provide Internet services. Due to an increasing number of Web users and the complexity of users’ queries, simple and atomic services are not able to meet the needs of users; and to provide complex services, it requ...
Publication date: 2003